Introduction


(S//REL  TO  USA,  FVEY)  This  short-term  study  overviews  and  documents  key  elements  of  the  co-traveler 
analytics  both  under  development  and  operational  at  NSA.  Each  section  includes  a brief  description  of 
the  analytic,  its  status,  source  data,  and  caveats. 

(S//REL  TO  USA,  FVEY)  While  each  analytic  was  designed  to  operate  on  a particular  type  of  data  or  a 
particular  data  format,  many  can  likely  be  scaled  to  operate  on  other  data  sources.  For  instance, 
analytics  designed  for  DNR  GCID  or  VLR  data  might  also  apply  to  DNI  Geolocation  data. 

(S//REL  TO  USA,  FVEY)  The  process  of  documenting  these  analytics  raised  a series  of  important  issues 
that  not  only  distinguish  the  analytics  from  each  other,  but  more  importantly,  shape  the  landscape 
that  we  must  consider  in  moving  forward  to  meet  the  analytic  needs  at  NSA.  Some  of  these  issues  are 
discussed  in  the  next  section. 


Issues  and  Questions 

Should  a co-travel  analytic  consider  where  a GCID  or  VLR  is  physically  located? 

o Many  GSM  analytics  use  GCID  information  to  identify  co-travelers.  If  two  selectors  are  seen 
at  the  same  GCID  around  the  same  time,  they  are  considered  co-travel  candidates.  The 
analytic  does  not  need  to  know  where  the  GCID  is  physically  located.  However,  if  the 
individuals  are  using  different  network  providers  (e.g.,  T-Mobile  and  Verizon),  they  may  be 
physically  standing  next  to  each  other  as  their  mobiles  register  with  different  cell  towers. 
Co-travel  analytics  that  do  not  consider  the  physical  geo-locations  of  the  towers  will  not 
discover individuals that are co-traveling on different networks.
o Analytics  that  make  use  of  point  data  (e.g.,  Thuraya)  necessarily  need  to  consider 
geolocational  data  in  order  to  determine  distance  from  one  point  to  another. 
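The point-to-point distance mentioned above can be illustrated with a great-circle calculation. This is a minimal sketch, not the method used by any analytic in this study; the function name `haversine_km` and the Earth-radius constant are assumptions introduced for illustration.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius (assumed constant for this sketch)


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two LAT/LONG points (degrees)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    # Haversine formula: numerically stable for short distances.
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))
```

One degree of longitude at the equator comes out at roughly 111 km, which is a convenient sanity check for any implementation.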

Should  incidental  co-travelers  be  considered? 

o There  is  a difference  between  incidental  co-travel  due  to  collective  movement  (individuals 
with  similar  travel  behaviors  but  no  other  similarities)  and  functional  group-based  co-travel 
among  individuals  with  behaviorally  relevant  relationships.  CTCOP  makes  this  definition 
explicit,  but  warns  that  we  might  not  want  to  exclude  seemingly  incidental  co-travelers 
simply because we are unaware of their relationship.
o Other  factors,  such  as  contact  chains  and  target  COMSEC  behaviors  (frequent  power-down, 
handset  swapping,  SMS  behavior),  might  assist  in  determining  whether  co-travelers  are 
associated  through  their  travel  behaviors  alone  or  through  behaviorally  relevant 
relationships. 

Should geography play a role in co-travel?

o Because  it  is  difficult  to  know  where  a GSM  target  is  located  within  a GCID  or  VLR,  many  of 
the  GSM  co-travel  analytics  use  the  mathematical  central  point  in  the  VLR  or  GCID  as  a 
reference  point.  We  could  postulate  that  traveling  targets  will  be  located  along  roads,  train 
tracks,  or  footpaths  where  network  service  exists.  This  type  of  geographical  information 
could  theoretically  be  used  to  inform  a co-traveler  analytic  in  identifying  candidates 
(especially  those  that  are  traveling  via  the  same  means  of  transportation).  Geographical 
information  might  also  be  used  to  "fill  in  the  gaps"  when  data  is  missing  between  locations 
that  a target  visited. 

o Analytics in this study that make use of such geographical information include DSD's Co-travel
analytic and the Geospatial Analysis Tradecraft Center's (GATC's) Opportunity Volume analytic.

Should  device  and  collection  sampling  play  a role  in  determining  co-travelers? 

o We  may  collect  hundreds  of  events  from  one  target's  mobile  phone  while  collecting  only  a 
few  events  from  his  co-traveler's  mobile  phone.  The  number  of  events  collected  may  be  due 
to  collection  bias,  differences  in  network  service,  and/or  target  COMSEC  behavior.  Analytics 
should  take  these  considerations  into  account  when  attempting  to  identify  co-travelers. 

Should  co-travelers  seen  in  different  source  databases  be  considered? 

o Depending  on  a target's  preferred  communication  behaviors,  some  co-travelers  may  be 
seen  largely  in  DNR  GSM  data,  and  other  co-travelers  may  be  seen  largely  in  DNI  data.  We 
may  be  able  to  construct  a more  complete  picture  of  a target's  locations  over  time  if  we 
combine DNR and DNI data sources. It is worth assessing, however, the degree to which
combining multiple data sources would increase the number of false positives.

o Databases  that  do  not  contain  geolocation  information  might  also  be  considered.  For 
instance,  air  travelers  on  the  same  reservation  number  are  probably  co-traveling  on  the 
same  flight.  Users  sharing  a MAC  address  are  probably  co-located  using  the  same  device 
even  though  we  may  not  know  where  that  device  is  located.  Consistent  observations  of 
devices  within  the  same  LAIC  may  provide  evidence  of  co-location,  even  if  the  LAIC's 
physical  service  area  is  unknown.  Finally,  similarities  between  IP  addresses  may  indicate 
proximity  on  the  same  LAN,  even  if  the  physical  location  of  the  LAN  nodes  is  unknown. 

o The  one  analytic  in  this  study  that  attempts  to  combine  multiple  sources  of  information  to 
build  a more  holistic  picture  of  a target's  travel  pattern  is  the  TAC/Cafe/TMAC  Co-travel 
analytic. 

Can  co-travel  be  considered  a series  of  meetings? 

o We  attempted  to  limit  this  study  to  targets  co-traveling  through  two  or  more  locations 
within  an  analyst-specified  time  and  space  window.  If  those  locations  are  defined,  however, 
we  might  consider  co-travel  as  a series  of  "meetings"  at  known  locations.  Analytics  that 
detect  co-location  may  be  different  in  nature  from  those  that  detect  co-travel.  The  specific 
analytic  need  will  define  which  of  these  approaches  is  more  appropriate  and  efficient. 
o In this study, examples of meeting analytics that detect instances of co-location include the
GATC Opportunity Volume Analytic and the [redacted] Meet&Greet Spatial Chaining Analytic.

Analytics 


CHALKFUN 


Background 

(TS//SI/REL TO USA, FVEY) Chalkfun's Co-Travel analytic computes the date, time, and network location
of a mobile phone over a given time period, and then looks for other mobile phones that were seen in
the same network locations within a one-hour time window. When a selector was seen at the same
location (e.g., VLR) during the time window, the algorithm reduces processing time by choosing a few
events to match over the time period. Chalkfun is SPCMA-enabled (see footnote 1).

(S//SI/REL  TO  USA,  FVEY)  Note:  As  of  6 September  2012,  the  events  that  are  chosen  depend  on  the 
"sampling  method"  chosen  by  the  analyst  (most  active,  most  per  day,  first/last/most,  or 
first/last/spread).  The  "sampling  rate"  specifies  how  many  events  are  chosen  to  match.  As  Chalkfun 
moves  to  the  cloud,  this  option  will  be  discontinued. 

(TS//SI/REL  TO  USA,  FVEY)  The  cloud-based  version  of  Chalkfun  (see  R6  SORTINGLEAD  Co-traveler 
Analytic  section),  which  may  be  released  as  early  as  September  2012,  will  have  a number  of  additional 
features  and  options: 

• The  system  will  run  one  query  (rather  than  separate  queries)  for  all  of  the  IMSIs,  MSISDNs,  VLRs, 
and  GCIDs  that  an  analyst  enters  (as  if  the  selectors  and  areas  of  interest  were  joined  with  an 
"OR").  The  system  currently  runs  separate  queries  for  each,  returning  separate  sets  of  results  for 
each  combination  of  selector  and  areas  of  interest.  The  cloud-based  version  will  also  enable  the 
user  to  set  the  size  of  the  time  window  that  the  analytic  considers,  rather  than  defaulting  to  one 
hour  (as  described  above). 

• The  user  will  be  able  to  choose  the  countries  or  locations  of  interest.  Blacklist  and  whitelist 
features  will  enable  the  user  to  instruct  the  system  to  ignore  activity  within  a region,  or  restrict 
analysis to specified regions of interest (e.g., ignore activity in [redacted] or use only activity from [redacted]).


• In  considering  potential  co-travelers,  the  analyst  will  have  the  option  to  ignore  activity  in  which 
the target is in his home country.


1 (S//SI//REL) SPCMA enables the analytic to chain "from," "through," or "to" communications metadata fields
without  regard  to  the  nationality  or  location  of  the  communicants,  and  users  may  view  those  same 
communications  metadata  fields  in  an  unmasked  form. 

• The analyst will be able to filter in or out potential co-travelers with specified prefixes (for
instance, return only [redacted] mobiles, remove all [redacted] mobiles, or include only mobiles
that are from the same country as the target).


Status  and  Summary 


Status:
- Operational; available at analysts' desktops
- Cloud version could be available as early as September 2012

Source Data:
- All FASCIA data containing VLR and GCID information

Caveats:
- Current version is not cloud-based and can have long processing times; however, a cloud-based
solution is imminent
- Analytic will only return co-travelers on the same provider network

DSD  Co-Travel  Analytic 


Background 

(S//SI/REL TO USA, FVEY) The DSD Co-Travel analytic predicts target locations and co-travelers by
calculating time-based travel trajectories. Probable travel routes are calculated from observed locations
by determining the most likely paths and travel times, similar to the approach used in turn-by-turn
navigation systems. These target travel paths are represented as a series of LAT/LONG waypoints or line
segments along the probable travel routes, such as roads. The travel paths are divided into segments
(e.g., 20 to 50 km along the road). The analytic predicts the approximate time at which the target would
theoretically arrive at each segment waypoint, based on projected travel times between known locations.
Then, within the travel window, the analytic discovers candidate co-travellers that intersect locations
along the buffered travel path. The next step is performed using interactive Renoir analysis of a
two-mode graph representing the route segments and the selectors observed on those segments within the
time windows. Once the data is clean and candidate co-travellers are identified, detailed analysis can
be done in Renoir or in other tools such as GeoTime, incorporating supporting data such as
communications events and content.
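The arrival-time prediction described above can be sketched as a simple interpolation. The sketch assumes constant speed between two observed locations and takes the cumulative route distance of each waypoint as given; the source does not specify the actual travel-time model, and `predict_arrival_times` is a hypothetical name.

```python
def predict_arrival_times(cum_km, t_start, t_end):
    """Interpolate arrival times at waypoints along a route.

    cum_km  -- cumulative distance (km) of each waypoint along the route; the
               first and last entries are the two observed locations
    t_start -- observation time at the first waypoint (seconds)
    t_end   -- observation time at the last waypoint (seconds)

    Assumes constant speed between the two observations (illustrative only).
    """
    total = cum_km[-1] - cum_km[0]
    if total <= 0:
        raise ValueError("route must have positive length")
    duration = t_end - t_start
    return [t_start + (d - cum_km[0]) / total * duration for d in cum_km]


# A 100 km route observed at t=0 s and t=7200 s, with waypoints at 25 and 50 km.
times = predict_arrival_times([0, 25, 50, 100], 0.0, 7200.0)
```

A routing engine with per-road speeds would replace the constant-speed assumption; the output shape (one predicted time per waypoint) stays the same.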

(S//SI/REL  TO  USA,  FVEY)  The  analytic  currently  runs  on  a Netezza-based  architecture,  called  Hectic 
Snare,  that  rapidly  executes  MySQL-based  QFDs.  This  architecture  enables  interactive  exploratory 
analysis  and  rapid  pattern  matching.  The  analytic  is  distributable  and  could  be  implemented  in 
Hadoop/MapReduce  or  Accumulo. 

(S//SI/REL TO USA, FVEY) This analytic was tested using an [redacted] terrorist case study. The case
study used approximately 80,000 base station locations and 16 billion mobile location records from
CDRs (call detail records) and infrastructure collection from DRT and Juggernaut systems. This case study
showed that more candidate co-travellers were discovered by analyzing the travel paths than by
considering common meeting locations alone.

Status and Summary

Status:
- Analytic implemented and tested at DSD.

Source Data:
- Mobile CDRs and residing in Netezza-based architecture.

Caveats:
- Requires Netezza (current implementation)
- Requires Renoir

Future  Work 

(S//SI/REL  TO  USA,  FVEY)  DSD  would  like  to  integrate  key  meeting  locations  into  this  analytic,  such  as 
safehouses.  Plans  are  also  underway  to  identify  targets  based  on  COMSEC  behaviors  such  as  identifying 
mobiles  that  are  turned  off  right  before  convergence  between  two  travel  paths  occurs. 


Geospatial  Analysis  Tradecraft  Center  (GATC)  Opportunity  Volume 
Analytic 


Background 

(TS//SI/REL  TO  USA,  FVEY)  The  opportunity  volume  analytic  determines  whether  two  entities  (e.g. 
devices)  could  have  been  co-located  by  considering  the  possibility  of  their  travel  paths  intersecting.  The 
opportunity  volume  analytic  requires  pairs  of  event  locations  and  times  for  each  entity,  and  computes 
the  possible  locations  and  times  in  which  the  two  entities  could  have  been  co-located.  It  does  this  by 
computing  possible  travel  route  surfaces  for  each  entity  between  the  specified  events,  using  a travel 
cost  surface  computed  from  terrain,  land  cover,  and  road  network  data.  These  possible  travel  route 
surfaces  include  the  temporal  dimension  (that  is,  the  period  of  time  in  which  the  entity  could  have  been 
at  the  given  location);  the  intersection  between  these  multidimensional  surfaces  represents  the  places 
and  times  during  which  the  entities  could  have  been  co-located.  The  analytic  was  developed  using  GPS 
point  event  data,  but  the  analytic  actually  uses  a 1-km  grid  for  the  spatial  resolution  and  a 15-minute 
period  for  the  temporal  resolution,  so  it  can  be  applied  to  any  data  that  can  be  expressed  in  these 
terms. 
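The co-location test described above can be illustrated on a flat grid. This is a deliberately simplified sketch: it replaces the terrain, land cover, and road-network cost surface with straight-line travel at a single assumed maximum speed, and all names (`reachable`, `opportunity_cells`) are hypothetical.

```python
import math


def reachable(cell, t, prior, later, speed_kmh):
    """True if an entity observed at `prior` and then `later` could be in `cell` at time t.

    cell         -- (x_km, y_km) grid cell centre
    prior, later -- (x_km, y_km, t_hours) observed events, prior before later
    """
    (xa, ya, ta), (xb, yb, tb) = prior, later
    if not ta <= t <= tb:
        return False
    d_in = math.hypot(cell[0] - xa, cell[1] - ya)    # distance from the earlier event
    d_out = math.hypot(xb - cell[0], yb - cell[1])   # distance on to the later event
    return d_in <= speed_kmh * (t - ta) and d_out <= speed_kmh * (tb - t)


def opportunity_cells(pair1, pair2, speed_kmh, grid, times):
    """Grid cells and time steps in which both entities could have been co-located."""
    return [
        (c, t)
        for t in times
        for c in grid
        if reachable(c, t, pair1[0], pair1[1], speed_kmh)
        and reachable(c, t, pair2[0], pair2[1], speed_kmh)
    ]


# Two entities 10 km apart, each seen at its own position at t=0 h and t=2 h.
grid = [(x, 0) for x in range(11)]           # 1 km cells along a line
times = [0.75, 1.0, 1.25]                    # 15-minute steps around the midpoint
e1 = ((0, 0, 0.0), (0, 0, 2.0))
e2 = ((10, 0, 0.0), (10, 0, 2.0))
hits = opportunity_cells(e1, e2, 5.0, grid, times)
```

At 5 km/h the only opportunity is the midpoint cell at t = 1 h; an empty result would mean the two entities could not have met between their observed events.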


Status  and  Summary 

Status:
- Prototype service implemented on NGANet. Not yet ported to NSANet.

Source Data:
- Geohashes of GPS point event data.

Caveats:
- Requires event locations and times for every selector.
- Designed for 1 km grid-based locations and 15-minute time intervals.
- Co-travel capability would require analyst to define a series of meetings at specified locations.

Future Work

(TS//SI/REL  TO  USA,  FVEY)  The  purpose  of  this  service  is  to  determine  whether  two  entities  could  have 
been  co-located  given  observed  event  locations  for  those  entities.  To  detect  co-travel,  the  analyst  would 
need  to  define  a series  of  meeting  locations  and  times.  The  opportunity  volume  analytic  could  also 
provide a mechanism for vetting co-travel analytics by testing for possible co-location events along
co-travel routes.


TMI  Co-Traveler  Analytic 


Background 

(TS//SI/REL TO USA, FVEY) The Track Mutual Information (TMI) cloud analytic [redacted] was developed
as a study under their graph analytics, alerting, and target development program. The analytic is
oriented to work on 7 to 30 days' worth of regional collection. It has been tested on RT-RG data from
the region. Instead of using GCID information as co-travel reference points, the analytic works
cross-network by computing target "closeness" based on the GCID Lat/Long GEO information and time.
The Lat/Long information is obtained from RT-RG.

(TS//SI/REL  TO  USA,  FVEY)  The  analytic  starts  by  computing  event  sequences  of  LAT,  LONG,  and  time  for 
each  selector.  These  are  called  "tracks".  It  then  computes  a value  that  measures  how  far  the  selector 
has  traveled  in  general.  If  the  selector  has  not  traveled  outside  a 20  to  50  km  radius,  the  selector  is  not 
considered.  Each  eligible  selector's  tracks  are  pairwise-compared  to  the  others  and  a measure  of 
similarity  in  time  and  space  is  computed. 


Status  and  Summary 


Status:
- Initial development completed. In testing phase, not yet operational

Source Data:
- Sortinglead summaries of FASCIA data on GM-PLACE and
- RT-RG regional GSM collection

Caveats:
- Analytic only considers tasked selectors as seeds.
- Analytic does not consider targets that do not travel outside a 20 to 50 km radius.
- Track dataset must be repopulated for each data update

Future  Work 

(TS//SI/REL TO USA, FVEY) [redacted] would like to reduce processing by creating an index containing selectors
whose  tracks  are  near  each  other  in  space.  To  achieve  this,  future  work  may  make  use  of  a GEOAddress 
hashing  algorithm  that  uses  LAT/LONG  information  to  group  cell  towers  into  clusters  that  are  in  the 
same  region.  This  hash  considers  latitude  and  longitude  only,  and  is  agnostic  to  the  targets'  service 
provider.  It  may  be  possible  to  also  compare  target  tracks  quickly  by  comparing  these  GeoAddresses. 

Co-Traveler Analytics


Background 

(TS//SI/REL TO USA, FVEY) [redacted] has developed two co-travel analytics: Fast Follower (FF) and
Meet&Greet  Spatial  Chaining  (MGSC).  The  FF  analytic  was  initially  designed  to  detect  individuals  who 
are  following  station  personnel.  Detailed  non-SIGINT  path  data  is  collected  consensually  on  the  station 
personnel,  and  this  reference  path  data  provides  the  seeds  for  this  analytic,  which  attempts  to  discover 
mobile  GEO  data  indicating  individuals  that  may  be  following  the  station  personnel.  The  MGSC  analytic 
is  designed  to  detect  meetings  between  high-value  individuals  and  other  entities. 

(TS//SI/REL  TO  USA,  FVEY)  The  FF  analytic  begins  by  considering  non-SIGINT  reference  paths  for  station 
personnel  based  on  detailed  knowledge  of  the  entity's  location.  Candidate  followers  are  determined  by 
identifying  other  individuals  that  have  traversed  some  number  of  consecutive  points  (determined  by  the 
analyst)  that  match  the  reference  path  in  space  and  time.  The  analyst  also  sets  a parameter  to  specify 
the  minimum  distance  that  must  be  covered  along  a candidate  path. 

(S//SI/REL  TO  USA,  FVEY)  The  MGSC  analytic  is  designed  for  ELKPRINTS  data  from  smartphones.  This 
analytic  identifies  sequences  of  consecutive  location  points  close  in  time  and  combines  them  into  a 
single  data  point.  A maximum  velocity  movement  parameter  is  applied  to  create  a time  window  around 
each  point  representing  the  approximate  time  at  which  the  individual  was  located  there  (as  opposed  to 
traveling  to  or  from  that  location).  Finally,  co-travelers  are  identified  by  discovering  pairs  of  selectors 
that  meet  the  duration  and  distance  thresholds  set  by  the  analyst  as  input  parameters.  Spatial  chaining 
software  aggregates  and  presents  the  meeting  data,  including  the  locations,  times,  and  scoring  metrics 
to  the  analyst. 
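The first MGSC step, combining consecutive location points that are close in time into a single data point, can be sketched as follows. The gap threshold, the use of a simple coordinate average, and the function names are assumptions for illustration; the source does not describe how the combined point is computed.

```python
def _summarize(run):
    """Collapse a run of (t, lat, lon) points into (t_first, t_last, mean_lat, mean_lon)."""
    n = len(run)
    return (run[0][0], run[-1][0],
            sum(p[1] for p in run) / n,
            sum(p[2] for p in run) / n)


def merge_close_points(points, max_gap_s):
    """Merge consecutive points whose time gap is at most max_gap_s seconds.

    points -- list of (t_seconds, lat, lon), sorted by time
    """
    merged, run = [], []
    for p in points:
        # A gap larger than the threshold closes the current run.
        if run and p[0] - run[-1][0] > max_gap_s:
            merged.append(_summarize(run))
            run = []
        run.append(p)
    if run:
        merged.append(_summarize(run))
    return merged


pts = [(0, 1.0, 2.0), (30, 3.0, 4.0), (1000, 5.0, 6.0)]
stays = merge_close_points(pts, max_gap_s=60)
```

Each merged point keeps its first and last timestamps, which is the information a later step would need to build the time window around that location.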


Status  and  Summary 


Status:
- The MGSC analytic has been tested on real ELKPRINTS data, but results have not been validated by
operational analysts.
- The FF analytic has been tested on made-up data.

Source Data:
- Smartphone data from ELKPRINTS
- Reference-path data (FF)
- List of selectors (MGSC)

Caveats:
- Analytic designed for precise geolocation data (e.g., from smartphones)
- MGSC analytic would require the analyst to define a series of meetings

PACT  NGA-NSA  GATC  Analytic 


Background 

(TS//SI/REL  TO  USA,  FVEY)  The  PACT  analytic  is  a joint  NSA-NGA  effort  to  identify  co-traveling  Thuraya 
handsets. The effort was motivated by an increase in Thuraya phone usage by [redacted]

SIGINT Geospatial Analysts were able to characterize the travel behaviors of the targeted Thuraya
handsets and identify other handsets with similar patterns. The targeted handsets were observed
traveling between known [redacted] government and military installations; therefore, handsets with
similar travel behaviors were inferred to be [redacted] government forces.

(TS//SI/REL  TO  USA,  FVEY)  The  first  step  of  PACT  is  to  identify  a set  of  waypoints  for  each  target  handset. 
Waypoints  are  generated  from  sequences  of  events  that  cluster  together  in  space  and  time.  The  second 
step  is  to  identify  which  pairs  of  handsets  contain  similar  waypoint  clusters.  Pairs  are  scored  based  on 
the  number  of  waypoint  clusters  that  match.  This  analytic  also  considers  the  total  possible  number  of 
waypoint  clusters  for  each  selector,  so  that  the  total  number  of  communication  events  per  selector  is 
taken  into  consideration.  This  process  is  intended  to  reduce  the  possibility  of  producing  results  that 
include  incidental  co-travel.  The  third  step  in  this  analytic  identifies  persistent  patterns  by  examining  the 
time  periods  over  which  co-location  occurs  for  each  co-travel  candidate  pair. 
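The second PACT step scores pairs by matching waypoint clusters while accounting for how many clusters each selector has in total. The source does not give the scoring formula; the sketch below uses a Jaccard ratio as one plausible normalization, and the name `pair_score` and the cluster identifiers are hypothetical.

```python
def pair_score(clusters_a, clusters_b):
    """Share of waypoint clusters that two handsets have in common.

    clusters_a, clusters_b -- iterables of cluster identifiers, one set per handset

    Dividing by the union penalizes a handset that simply has many clusters,
    which is one way to damp incidental matches (an assumption of this sketch).
    """
    a, b = set(clusters_a), set(clusters_b)
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


score = pair_score({"w1", "w2", "w3"}, {"w2", "w3", "w4"})
```

With two shared clusters out of four distinct ones, the example pair scores 0.5; a very active handset that matches the same two clusters among dozens of its own would score far lower.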


Status  and  Summary 


Status:
- Tested on VOICESAIL data from CULTWEAVE. Patterns stored in QFD.
- In process of transitioning PACT to NSA/S2.

Source Data:
- Thuraya data from CULTWEAVE (~500 M waypoints in CULTWEAVE)

Caveats:
- Analytic designed for Thuraya or other point data

Future  Work 

Future  work  could  involve  applying  this  analytic  to  other  types  of  QFD  datasets  such  as  Inmarsat  and 
GSM  data.  The  team  is  also  interested  in  building  on  this  analytic  to  enable  discovery  of  asynchronous 
co-traveling  relationships. 


R6  SORTINGLEAD  Co-Traveler  Analytic 


Background 

(S//REL TO USA, FVEY) R6 has been partnering with Chalkfun to upgrade the Chalkfun co-traveler
analytic  to  a cloud-based  analytic  that  will  run  on  Cloud  14  (to  eventually  be  migrated  to  MDR-2). 

(TS//SI/REL  TO  USA,  FVEY)  The  R6  co-traveler  analytic  accepts  a selector  and  timeframe  as  input,  and 
then  derives  an  itinerary  for  the  selector  that  includes  the  CELL  IDs  and/or  VLRs  (depending  on  what  is 
available).  The  itinerary  is  based  on  a series  of  waypoints  generated  from  the  location  information  that 
is  available  in  FASCIA-PCS.  Then,  the  analytic  searches  for  other  selectors  that  were  "near"  these 
waypoints  in  space  and  time.  Time  windows  are  configurable  and  can  be  adjusted  by  the  user.  Each 
candidate  is  scored  and  then  prioritized  based  on  the  scores. 

(TS//SI/REL  TO  USA,  FVEY)  The  R6  co-traveler  analytic  operates  on  Sortinglead  Event  Summaries  and  a 
GEO  Index.  The  Sortinglead  Event  Summaries  provide  rapid  access  to  FASCIA  PCS  events  by  summarizing 
and enriching key elements of selector behavior. The Sortinglead Event Summaries benefit this analytic
because  they  can  provide  enriched  location  information  about  selectors  that  is  not  present  in  the  raw 
metadata.  The  GEO  Index  contains  a mapping  between  the  locations  (GCIDs  or  VLRs)  visited  by  a 
selector  and  the  time  (day/minute)  that  the  visit(s)  occurred.  Information  from  command  and  control 
networks  that  track  IED  attacks  is  also  used  to  enrich  the  GEO  Index. 

(TS//SI/REL TO USA, FVEY) The result sets returned by this type of analytic can be enormous.
Each candidate will have some level of time and space overlap with the seed. Prioritization
occurs  by  assessing  the  quality  of  the  overlap  in  terms  of  time  and  space  closeness.  The  analyst  may 
choose  to  triage  any  number  of  potential  candidates  (e.g.  top  10  or  top  100  candidates,  or  candidates 
that  surpass  a given  threshold). 
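The triage options mentioned above (a top-N cut or a score threshold) can be sketched over an already scored candidate list. The scoring itself is not shown, and the function name `triage` and its parameters are assumptions for illustration.

```python
def triage(scores, top_n=None, threshold=None):
    """Rank scored candidates and keep the top N and/or those at or above a threshold.

    scores -- dict mapping a candidate identifier to its overlap score
    """
    items = list(scores.items())
    if threshold is not None:
        items = [(cand, s) for cand, s in items if s >= threshold]
    ranked = sorted(items, key=lambda kv: kv[1], reverse=True)
    return ranked[:top_n] if top_n is not None else ranked


ranked = triage({"a": 0.9, "b": 0.2, "c": 0.5}, top_n=2)
```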


Status  and  Summary 


Status:
- In testing phase to be replacement back-end for the current production CHALKFUN co-traveler tool
- Cloud-based (MapReduce) implementation under development to handle larger numbers of queries
simultaneously

Source Data:
- FASCIA PCS Sortinglead Summaries
- CHALKFUN enrichment (VLR country mapping)

Caveats:
- Analytic cannot recover cross-network co-travelers
- Analytic will not be effective against stationary (non-traveling) targets
- Processing is memory intensive
- Analytic is sensitive to large cells, VLRs, and dense areas
- Not directly applicable to sat phones with LAT/LONG information
- Results can be very sensitive to timeframe chosen as input. For instance, analytic will not be
effective for large queries across multiple countries and large time frames (e.g., anywhere in
[redacted] over the past year and then anywhere in [redacted]).


Future  Work 

(TS//SI/REL  TO  USA,  FVEY)  Because  the  R6  co-traveler  analytic  depends  on  GCID  and  VLR  locations  as 
meeting  points  or  waypoints,  it  will  not  return  selectors  that  co-travel  on  different  provider  networks. 
(For  instance,  it  could  not  return  a Verizon  selector  co-traveling  with  a T-Mobile  selector.)  The  R6  team 
is  working  on  experiments  that  might  "alias"  seed  selectors  to  nearby  selectors  on  other  networks  to  get 
around  this  problem,  but  this  poses  challenges.  The  RT-RG  analytic  (discussed  later  in  this  paper)  uses 
relative  velocities  to  deal  with  the  cross-network  challenge,  but  this  approach  requires  pre-computing 
travel  behavior  for  all  pairs  of  selectors,  which  can  be  computationally  expensive. 

RT-RG Sidekicks


Background 

(TS//SI/REL TO USA, FVEY) The RT-RG Sidekick Cloud-Based Co-traveler analytic compares average travel
velocity between pairs of selectors to infer whether co-travel would be practically possible.
The  velocity  factor  is  intended  to  reduce  the  number  of  false  positives  when  considering  travel  among 
urban  areas  by  filtering  out  pairs  of  selectors  that  were  seen  at  the  same  series  of  CELL  IDs  or  VLRs  over 
time,  but  could  not  have  been  traveling  together  because  the  location  data  timestamps  presuppose  an 
unreasonable  velocity.  This  may  happen  because  one  or  both  of  the  selectors  in  the  pair  may  have  been 
located  at  the  edges  of  the  network  coverage  during  one  or  more  of  their  travel  midpoints. 

(TS//SI/REL TO USA, FVEY) The analytic first computes "movement summaries" of all available tasked
selectors. The movement summaries contain a list of locations that a target visited during the timeframe
of interest, given by the analyst. Locations are defined by CELL IDs (for GSM) or GEO-Hashes (for [redacted]
any other selectors with Lat/Long). Then, the system discovers pairs of targets that could be traveling
together  by  comparing  their  sequences  of  physical  locations  and  factoring  out  pairs  that  could  not  have 
reasonably  arrived  at  the  meeting  waypoints  within  10  minutes  of  each  other. 
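The velocity filter described above can be sketched by treating a candidate pair as a single traveling unit: if the two sets of observations, interleaved by time, imply an unreasonable speed between consecutive points, the pair is dropped. Planar kilometre coordinates, the speed ceiling, and all function names are assumptions of this sketch, not details from the source.

```python
import math


def implied_speed_kmh(p, q):
    """Speed needed to get from observation p to q; each is (t_hours, x_km, y_km)."""
    dt = abs(q[0] - p[0])
    dist = math.hypot(q[1] - p[1], q[2] - p[2])
    if dt == 0:
        return 0.0 if dist == 0 else math.inf
    return dist / dt


def could_travel_together(track_a, track_b, max_kmh):
    """True if the merged, time-ordered observations never require more than max_kmh."""
    merged = sorted(track_a + track_b)
    return all(implied_speed_kmh(p, q) <= max_kmh
               for p, q in zip(merged, merged[1:]))


a = [(0.0, 0.0, 0.0), (1.0, 50.0, 0.0)]
near = [(0.5, 25.0, 0.0)]     # consistent with a's 50 km/h journey
far = [(0.5, 200.0, 0.0)]     # would require 400 km/h
ok = could_travel_together(a, near, max_kmh=60.0)
rejected = not could_travel_together(a, far, max_kmh=60.0)
```

A check like this works on physical coordinates only, which is why the same idea applies across provider networks.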

(TS//SI/REL  TO  USA,  FVEY)  One  of  the  main  benefits  of  the  RT-RG  Sidekicks  analytic  is  that  it  is  not 
constrained  by  provider  network.  Because  it  considers  physical  (LAT/LONG)  locations  and  travel 
velocities,  it  can  provide  co-traveler  results  that  include  selectors  on  different  provider  networks. 


Status  and  Summary 


Status:
- QFD available at RT-RG analyst desktop.
- RT-RG Tools: Goldminer, CHET, GEOT

Source Data:
- Sortinglead Event Summaries ([illegible] of FASCIA PCS)
- Currently running on RT-RG
- Could possibly scale to FASCIA event summaries

Caveats:
- Requires accurate tower geo data (location and date)
- Requires pre-computing all selectors against all selectors, which can be expensive
- Current output includes only tasked selectors
- Analytic is not designed for stationary targets.


Future  Work 

(TS//SI/REL TO USA, FVEY) Currently, the system is integrated with RT-RG, operating on [redacted]
GSM data. It may scale to a larger data source; however, it is designed to precompute sidekicks for each
possible pair of tasked selectors.

(TS//SI/REL  TO  USA,  FVEY)  This  analytic  could  also  be  applied  to  DNI  location  data. 

Scalable Analytics Tradecraft Center (SATC) Geospatial Lifelines Co-Travel QFD


Background 

(TS//SI/REL  TO  USA,  FVEY)  The  geospatial  lifelines  QFD  applies  the  concept  of  "dwell  times"  to  identify 
DNR  co-travelers.  Dwell  times  describe  the  time  period  spent  at  the  beginning  or  ending  destination.  A 
location  is  considered  a beginning  or  ending  location  if  the  dwell  time  at  that  location  is  greater  than  2 
hours. 
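The dwell-time rule above can be sketched as a scan over time-ordered observations. The 2-hour threshold comes from the text; measuring dwell as the span between the first and last consecutive observation at a location, and the name `dwell_locations`, are assumptions for illustration.

```python
def dwell_locations(observations, min_dwell_s=2 * 3600):
    """Find locations where consecutive observations span more than min_dwell_s.

    observations -- list of (t_seconds, location_id), sorted by time
    Returns a list of (location_id, t_first, t_last).
    """
    result = []
    i = 0
    while i < len(observations):
        # Extend j to the end of the run of observations at the same location.
        j = i
        while j + 1 < len(observations) and observations[j + 1][1] == observations[i][1]:
            j += 1
        t_first, t_last = observations[i][0], observations[j][0]
        if t_last - t_first > min_dwell_s:
            result.append((observations[i][1], t_first, t_last))
        i = j + 1
    return result


obs = [(0, "A"), (3600, "A"), (9000, "A"), (9600, "B"), (10000, "C"), (20000, "C")]
endpoints = dwell_locations(obs)
```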

(TS//SI/REL  TO  USA,  FVEY)  This  QFD  first  generates  geohashes  using  GSM  event  data,  and  then 
calculates  transition  lines  indicating  that  a device  traveled  from  one  geohash  to  another.  The  result  is  a 
graph  in  which  the  geohashes  represent  nodes  and  the  transitions  represent  links  or  edges.  Clustering 
algorithms  are  applied  to  the  graphs  to  determine  locations  and  selectors  of  interest. 
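The transition graph described above, with geohashes as nodes and transitions as edges, can be sketched as a weighted edge count. The geohash strings are assumed to be computed already, the clustering step is not shown, and `transition_edges` is a hypothetical name.

```python
from collections import Counter


def transition_edges(geohash_sequence):
    """Count directed transitions between distinct consecutive geohashes.

    geohash_sequence -- time-ordered list of geohash strings for one device
    Returns a Counter keyed by (from_geohash, to_geohash).
    """
    edges = Counter()
    for a, b in zip(geohash_sequence, geohash_sequence[1:]):
        if a != b:  # repeated observations in the same cell are not transitions
            edges[(a, b)] += 1
    return edges


edges = transition_edges(["u1", "u1", "u2", "u3", "u2", "u3"])
```

Summing such counters across devices gives an aggregate graph to which a clustering algorithm could then be applied.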

(TS//SI/REL  TO  USA,  FVEY)  The  geospatial  lifelines  represent  the  beginning  and  ending  locations,  as 
defined by their dwell times, and all other intermediate observations. The likelihood of co-travel along
paths  between  starting  and  destination  points  is  based  on  the  following  measurements:  net  distance, 
time  of  transition  (mins),  speed  (kph),  Azimuth,  and  number  of  travel  segments. 


Status  and  Summary 


Status:
- Analytic tested on 90 days of GSM event data from [redacted]
- Code is available through SATC, but analytic is no longer under development.

Source Data:
- Geohashes of GSM event data retrieved from FASCIA.

Caveats:
- Analytic designed for GSM data, but could be applied to other types of data
- Oriented to targets that remain in one location for at least 2 hours
- Requires Geocoded source data for generating Geohashes

Future  Work 

(S//REL TO USA, FVEY) The code for this QFD is available through SATC, but the analytic is no longer
under  development.  Ideas  for  future  work  before  the  project  ended  included  adding  acceleration  and 
sinuosity  to  the  computation. 


SSG  Common  IMSIs  Analytic 


Background 

(S//SI//REL) The Common IMSIs Analytic is a model in SEDB JEMA that finds SIM card activity seen on cell tower panels in multiple areas (e.g., border crossings commonly used by traffickers). It makes use of the Tower QFD.

(S//SI//REL) The analyst inputs areas of interest and a time range. The analytic returns an Excel file with a list of IMSIs seen in those areas at that time, enriched with OCTAVE tasking information. One limitation is that tower locations in OCTSKYWARD can be imprecise. Also, the SEDB Tower QFD summarizes IMSIs by LAIC by day; summaries by MSISDN or IMEI are not available.


Status and Summary

Status:
- Available in JEMA.

Source Data:
- OCTAVE and FASCIA

Caveats:
- Cell tower locations in OCTSKYWARD can be imprecise.
- The SEDB Tower QFD summarizes IMSIs by LAIC by day.
- Summaries by MSISDN or IMEI are not available.

Additional  Information 

https://wiki.nsa.ic.gov/wiki/Analytics  Taxonomy 


https://wiki.nsa.ic.gov/wiki/DNR  Travel  Pattern 


Target  Analysis  Center  (TAC)/Cafe/  Travel  and  Mobility  Analysis  Center 
(TMAC)  DNI  Co-Travel  Analytic 

Cafe  Spin  1 (October  2011  - January  2012) 


Background 

(TS//SI/REL  TO  USA,  FVEY)  The  Cafe  project  involved  TMAC,  SSG,  T1212,  and  S2I5  working  in  concert  to 
develop  both  DNI  and  DNR  cloud-based  travel  analytics.  The  absence  of  a cloud-based  solution  that 
could  run  over  bulk  data  motivated  this  initiative.  The  Cafe  objective  was  to  steer  cloud  travel  analytics 
toward  operational  use  and  ultimately  merge  the  DNI  and  DNR  analytics  in  a unified  co-travel  analytic. 
These  analytics  are  currently  still  under  development;  however,  they  are  available  to  the  development 
community  on  GM-PLACE. 

(TS//SI/REL TO USA, FVEY) This analytic uses IP geolocation of active user/presence events as an indication of travel.

(TS//SI/REL TO USA, FVEY) The DNI analytic operates in one of two modes. The first mode accepts a list of tasked targets via UTT and attempts to identify co-travelers for those targets deemed to have traveled during a specified time window (typically 30 days). The analytic only considers targets that traveled between at least 2 countries in a given month. For these traveling targets, candidate co-travelers are scored based on how many times they were seen in the same locations during the same times as the target. Target locations are given by DNI selector IP geolocation, provided by ASDF enriched with GEO reference data (or geo-tagging where available). Because this data provides city-level location resolution, co-traveler candidates are assigned scores based on the extent to which they were seen in the same cities and on the same days as targets.
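The first mode's scoring can be sketched as a count of shared (city, day) sightings. The data layout, the function names, and the use of a plain count as the score are assumptions for illustration; the source states only that scores reflect the extent of same-city, same-day overlap.

```python
from collections import Counter

def has_traveled(countries_seen):
    """A target qualifies only if seen in at least 2 countries in the window."""
    return len(set(countries_seen)) >= 2

def score_candidates(target_sightings, candidate_sightings):
    """Score each candidate by the number of (city, day) pairs shared with the target.

    target_sightings: set of (city, day) tuples for the target.
    candidate_sightings: dict mapping a selector to its own set of (city, day)
    tuples (hypothetical layout).
    Returns (selector, score) pairs, highest score first.
    """
    scores = Counter()
    for selector, sightings in candidate_sightings.items():
        shared = len(target_sightings & sightings)
        if shared:
            scores[selector] = shared
    return scores.most_common()

# Toy data with placeholder city names and selectors.
target = {("CityA", "2012-01-03"), ("CityB", "2012-01-10"), ("CityC", "2012-01-20")}
candidates = {
    "selector-x": {("CityA", "2012-01-03"), ("CityB", "2012-01-10"), ("CityC", "2012-01-20")},
    "selector-y": {("CityB", "2012-01-10")},
    "selector-z": {("CityA", "2012-01-04")},  # same city, different day
}
ranked = score_candidates(target, candidates)
```

Because the location resolution is city-level and the time resolution is one day, a shared sighting here is weak evidence on its own; repeated overlap across several cities is what raises a candidate's score.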

(TS//SI/REL TO USA, FVEY) The second mode accepts a pattern representing target travel spanning countries of interest (e.g., [redacted]) and, optionally, the days on which the countries were visited. In this mode, the TAC/Cafe/TMAC DNI Co-travel analytic identifies travelers that (at minimum) match the pattern. All candidates that match the pattern are regarded as possible co-travelers.
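One way to read "(at minimum) match the pattern" is as an ordered subsequence match: the traveler must visit the pattern's countries in order but may make additional stops. That reading, the data layout, and the name `matches_pattern` are assumptions; the source does not define the matching rule.

```python
def matches_pattern(itinerary, pattern):
    """Return True if the pattern occurs, in order, within the itinerary.

    itinerary: chronological list of (country, day) tuples for one traveler.
    pattern: list of (country, day_or_None); None means any day is accepted.
    Extra stops in the itinerary are allowed (assumed reading of "at minimum").
    """
    remaining = iter(itinerary)
    for want_country, want_day in pattern:
        for country, day in remaining:
            if country == want_country and (want_day is None or day == want_day):
                break  # this pattern step is satisfied; move to the next one
        else:
            return False  # itinerary exhausted before the step was matched
    return True

# Toy data with placeholder country codes and day numbers.
itinerary = [("AA", 1), ("BB", 5), ("CC", 9), ("DD", 12)]
in_order = matches_pattern(itinerary, [("AA", None), ("CC", 9)])
wrong_order = matches_pattern(itinerary, [("CC", None), ("AA", None)])
wrong_day = matches_pattern(itinerary, [("AA", None), ("CC", 10)])
```

Sharing one iterator across pattern steps is what enforces the ordering: each step can only match stops that come after the previous match.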

(S//REL  TO  USA,  FVEY)  The  result  of  these  analytics  is  a QFD  monthly  roll-up  that  can  be  queried. 


Status and Summary

Status:
- Available to developers with access to Ghostmachine (GM-PLACE)

Source Data:
- Tasked DNI selectors (UTT)
- Geotagged ASDF data
- User-provided travel patterns

Caveats:
- Tasked targets or travel patterns provided as input; results include tasked and untasked targets
- Analytic operates at the country level to determine travel and at the city level to determine co-travelers, and is designed to provide a monthly QFD roll-up
- Proxies and other shared IP settings can render IP geolocation susceptible to error

Future  Work 

(S//SI/REL  TO  USA,  FVEY)  The  TAC/Cafe/TMAC  DNI  Co-traveler  team  also  considered  capabilities  to 
enable  follow-on  queries  utilizing  CHALKFUN  for  convergence  efforts  to  identify  roaming  handsets  as 
possible  DNI  target  co-travelers. 

Other  resources 

https://ncmd-satcOl.ncmd.nsa.ic.gov/gambit/public/q/dni travel analytic cloud version


https://wiki.nsa.ic.gov/wiki/Cafetravel  dni  co-travelers 


TAC/Cafe/TMAC  DNR  Co-Traveler  Analytic 

Cafe Spin 2 (January - July 2012)


Background

(TS//SI/REL  TO  USA,  FVEY)  The  Cafe  project  involved  TMAC,  SSG,  T1212,  and  S2I5  working  in  concert  to 
develop  both  DNI  and  DNR  cloud-based  travel  analytics.  The  absence  of  a cloud-based  solution  that 
could  run  over  bulk  data  motivated  this  initiative.  The  Cafe  objective  was  to  merge  the  DNI  and  DNR 
analytics  to  create  one  complete  co-travel  analytic;  however  the  DNR  co-traveler  analytic,  described 
below,  is  currently  still  under  development. 

(TS//SI/REL  TO  USA,  FVEY)  The  DNR  cloud-based  analytic  considers  all  known  targets  (tasked  in  OCTAVE) 
that have traveled within a given date range (e.g., monthly roll-up to five-month range), and attempts to
find  their  co-travelers.  Co-travelers  are  defined  as  individuals  that  were  seen  in  the  same  area  (currently 
defined  by  VLRs)  around  the  same  time  as  the  targets.  The  output  includes  both  tasked  and  untasked 
selectors  as  possible  co-travelers  with  the  tasked  seeds.  Each  possible  co-traveler  is  assigned  a score 
that  indicates  the  probability  of  co-travel  with  the  seed.  Higher  scores  are  assigned  to  co-travelers  that 
are  seen  at  more  of  the  same  locations  and  closer  in  time  (pairs  are  given  one  point  if  seen  within  one 
hour,  and  a half  point  if  seen  within  two  hours  of  each  other). 
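The point scheme in the parenthetical can be sketched as follows. The code awards points per seed event based on the closest sighting of a candidate at the same VLR; treating the one-hour and two-hour bounds as inclusive, the data layout, and the name `cotravel_scores` are assumptions. The source does not say how raw points are converted into a probability, so the sketch returns raw points only.

```python
from collections import defaultdict
from datetime import datetime, timedelta

ONE_HOUR = timedelta(hours=1)
TWO_HOURS = timedelta(hours=2)

def cotravel_scores(seed_events, other_events):
    """Award 1 point per seed event matched within one hour at the same VLR,
    and 0.5 points if the closest match is within two hours.

    seed_events: list of (vlr, datetime) for the tasked seed.
    other_events: dict mapping a selector to its list of (vlr, datetime)
    (hypothetical layout).
    Returns (selector, points) pairs, highest first.
    """
    scores = defaultdict(float)
    for selector, events in other_events.items():
        for seed_vlr, seed_time in seed_events:
            gaps = [abs(t - seed_time) for vlr, t in events if vlr == seed_vlr]
            if not gaps:
                continue  # candidate never seen at this VLR
            closest = min(gaps)
            if closest <= ONE_HOUR:
                scores[selector] += 1.0
            elif closest <= TWO_HOURS:
                scores[selector] += 0.5
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

# Toy data with placeholder VLR names and selectors.
seed = [("VLR-1", datetime(2012, 3, 1, 10, 0)), ("VLR-2", datetime(2012, 3, 1, 14, 0))]
others = {
    "selector-a": [("VLR-1", datetime(2012, 3, 1, 10, 30)),
                   ("VLR-2", datetime(2012, 3, 1, 15, 30))],
    "selector-b": [("VLR-1", datetime(2012, 3, 1, 13, 0))],   # 3 hours apart
    "selector-d": [("VLR-2", datetime(2012, 3, 1, 14, 20))],
}
ranked_pairs = cotravel_scores(seed, others)
```

Here "selector-a" earns 1.5 points (one match within an hour, one within two hours), "selector-d" earns 1.0, and "selector-b" earns nothing because its only sighting is three hours from the seed.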


Status and Summary

Status:
- Analytic has been tested on FASCIA data on GM-PLACE
- Command line interface available to developers

Source Data:
- FASCIA data on GM-PLACE
- ~40B rows in the GM-PLACE CLOUDBASE table
- CHALKFUN Enrichment (VLR Country mapping)
- CLOUDBASE Events (IMSI, IMEI) rounded to nearest hour

Caveats:
- Analytic only considers tasked selectors as seeds
- Source data provided by VLRs
- Co-travel events are rolled up by the hour

Future  Work 

(S//SI/REL TO USA, FVEY) Follow-on analysis could take advantage of the FASTSCOPE reservation number feature, which returns all co-travelers that travel on the same reservation number within a given time period (because reservation numbers are reused, a specific timeframe must be provided).

Other  Resources 

https://wiki.nsa.ic.gov/wiki/DNR  Traveler 
https://wiki.nsa.ic.gov/wiki/DNR  Co-Traveler 
https://wiki.nsa.ic.gov/wiki/DNR  Travel  Pattern 


DNR  Co-Traveler  Manual  Analysis 

Taken from: https://ncmd-satcOl.ncmd.nsa.ic.gov/gambit/public/q/dnr co travel based on similiar cell ids over a time frame
1. Start with a target selector (e.g. IMSI)

2. Query the target selector for PCS events to identify the cell towers this target is hitting off of and at what date/time.

3.  Note  the  cell  towers,  location  of  the  cell  towers,  and  the  date/times 

4.  Query  those  cell  towers  (and  other  cell  towers  in  the  area)  for  those  dates  and  times  to 
identify  other  users  who  are  hitting  off  of  those  towers 

5.  Compare  the  results  of  the  users  hitting  off  of  the  cell  towers. 

6.  Rank  the  selectors  as  being  possible  candidates  for  co-travelers  based  on  what  cell  towers 
they  hit  on  at  the  right  times. 

7.  Selectors  that  are  reliably  seen  to  be  hitting  off  of  the  same  towers  at  the  same  times  more 
than  others  should  get  a higher  rank. 


Summary 

(S//SI/REL  TO  USA,  FVEY)  At  the  beginning  of  this  paper,  we  presented  a number  of  key  issues  and 
questions.  Many  of  the  analytics  define  themselves  by  (1)  the  key  issues  they  address  in  novel  ways  and 
(2)  the  types  of  source  data  on  which  they  operate. 

(S//SI/REL TO USA, FVEY) The key issues section highlights capabilities that might improve the accuracy of the analytic results. For example, analytics that have knowledge about the locations of GCIDs and VLRs can augment their procedures with non-SIGINT data such as geographic and terrestrial data. This information contains knowledge about the locations of highways and roads. Analytics that can geographically validate routes between meeting points can then use this information to constrain the possible co-travel routes and candidate co-travel selectors along those routes.

(S//SI/REL  TO  USA,  FVEY)  Analytics  that  can  operate  on  a variety  of  different  source  data  formats, 
including  both  DNI  and  DNR,  benefit  from  the  ability  to  exploit  divergent  data  sources  to  develop  more 
complete  pictures  of  target  travel  behavior. 

(S//SI/REL  TO  USA,  FVEY)  The  co-travel  analytics  in  this  study  are  at  various  stages  of  development, 
testing,  and  deployment.  One  possible  way  forward  could  be  to  have  an  independent  organization2 
perform  a formal  evaluation  of  these  analytics  using  a common  test  dataset.  This  would  enable  a fair 
comparison  and  assessment  of  the  analytics'  processing  time,  efficiency,  and  accuracy.  Understanding 
the  advantages  and  challenges  of  each  analytic  against  a common  test  dataset  with  ground  truth  may 
facilitate  planning  for  future  work. 


2 An  independent  organization  is  one  that  is  not  involved  in  the  development  of  any  of  these  analytics  and  that 
does  not  have  a stake  in  the  outcome. 
Support Center for their contributions to the section on Issues and Questions.


Summary Table of Co-Travel Analytics


Columns: Name of Analytic; Summary; Source Data; Architecture; Status; Caveats


Name of Analytic: CHALKFUN

Summary: Analytic computes the date, time, and network location of any (tasked or untasked) mobile phone over some time period, and then looks for other mobile phones that were seen in the same network locations around a one hour time window. When a selector was seen at the same location (e.g., VLR) during the time window, the algorithm will reduce processing time by choosing a few events to match over the time period. Chalkfun is SPCMA enabled.

Source Data:
- All FASCIA data containing VLR and GCID information

Architecture:
- Cloud-based version could be available as early as September 2012.

Status:
- Operational; available at analysts' desktops

Caveats:
- Current version is not cloud-based and can have long processing times; however, a cloud-based solution is imminent.
- Analytic will only return co-travelers on the same provider network

Name of Analytic: DSD Co-Travel Analytic

Summary: Predicts target locations and co-travelers by calculating time-based travel trajectories and identifying likely path intersections between observed locations. The analytic calculates travel times at waypoints similar to that used in turn-by-turn navigation systems.

Source Data:
- Mobile CDRs

Architecture:
- Netezza
- Could be implemented in Cloud-based architecture (Hadoop/MapReduce or Accumulo)

Status:
- Implemented and tested at DSD

Caveats:
- Requires Netezza (current implementation)
- Requires Renoir

Name of Analytic: Geospatial Analysis Tradecraft Center (GATC) Opportunity Volume Analytic

Summary: Determines whether two entities (e.g. devices) could have been co-located by considering the possibility of their travel paths intersecting. Computes possible travel routes for each entity between specified events, considering terrain, land cover, and road network data.

Source Data:
- Geohashes of GPS point event data.

Architecture:
- Cloud-based

Status:
- Prototype service implemented on NGANet. Not yet ported to NSANet.

Caveats:
- Requires event locations and times for every selector.
- Designed for 1 km grid-based locations and 15 minute time intervals.
- Co-travel capability would require analyst to define a series of meetings at specified locations.

Name of Analytic: Co-Traveler Analytic

Summary: The analytic computes event sequences of LAT, LONG, and time for each tasked selector. These are called "tracks". Each selector's tracks are pairwise-compared to the others and a measure of similarity in time and space is computed. The analytic works cross-network by computing target "closeness" based on the GCID Lat/Long GEO information and time.

Source Data:
- Sortinglead summaries of FASCIA data on GM-PLACE
- RT-RG regional GSM collection

Architecture:
- Cloud-based

Status:
- Initial development completed.
- In testing phase, not yet operational

Caveats:
- This cloud analytic is oriented to work on 7 to 30 days' worth of regional collection.
- Analytic only considers tasked selectors as seeds.
- Analytic does not consider targets that do not travel outside a 20 to 50 km radius.
- Track dataset must be repopulated for each data update

Name of Analytic: [redacted] Co-Traveler Analytics

Summary:
- The Fast Follower (FF) analytic considers non-SIGINT reference paths for station personnel based on detailed knowledge of the entity's location. Candidate followers are determined by identifying other individuals whose path matches the reference path in space and time.
- The Meet&Greet Spatial Chaining (MGSC) analytic applies a maximum velocity movement parameter to approximate the time that an individual was at each location. Co-travelers are identified by discovering pairs of selectors that meet duration and distance thresholds set by the analyst.

Source Data:
- Smartphone data from ELKPRINTS
- Reference-path data (FF)
- List of selectors (MGSC)

Architecture:
- Cloud-based
- Implemented in Java and ported to MapReduce

Status:
- The MGSC analytic has been tested on real ELKPRINTS data, but results have not been validated by operational analysts.
- The FF analytic has been tested on made-up data.

Caveats:
- Analytic designed for precise geolocation data (e.g., from smartphones)
- MGSC analytic would require the analyst to define a series of meetings

Name of Analytic: PACT NGA-NSA CATC Analytic

Summary: Identifies clusters of waypoints for each target handset. Identifies which pairs of handsets contain similar waypoint clusters. Pairs are scored based on the number of waypoint clusters that match.

Source Data:
- Data from CULTWEAVE via ICReach (e.g. ~5M locations over 6 years for 200K [...] locations per day)

Architecture:
- Cloud-based Hadoop MapReduce framework

Status:
- Tested on [redacted] data from CULTWEAVE.
- Patterns stored in QFD.
- In process of transitioning PACT to NSA/S2.

Caveats:
- Analytic designed for point data

Name of Analytic: R6 SORTINGLEAD Co-Traveler Analytic

Summary: Analytic accepts a tasked or untasked selector and timeframe as input, and then derives an itinerary for the selector that includes the CELL IDs and/or VLRs. The itinerary is based on a series of waypoints. The analytic searches for other selectors that were "near" these waypoints in space and time. Candidates are scored and prioritized.

Source Data:
- FASCIA PCS Sortinglead Summaries

Architecture:
- Cloud-based MapReduce

Status:
- In testing phase to be replacement back-end for the current production CHALKFUN co-traveler tool

Caveats:
- Analytic cannot recover cross-network co-travelers
- Analytic will not be effective against stationary (non-traveling) targets
- Processing is memory intensive
- Analytic is sensitive to large cells, VLRs, and dense areas
- Not directly applicable to sat phones with LAT/LONG information
- Results can be sensitive to timeframe chosen as input (not effective for large queries across multiple countries and large time frames)

Name of Analytic: RT-RG Sidekicks

Summary: (TS//SI/REL TO USA, FVEY) This analytic computes "movement summaries" of tasked selectors. These are lists of locations that a target visited during the timeframe of interest. Then, the system discovers pairs of targets that could be traveling together by comparing their movement summaries, factoring out pairs that could not have reasonably arrived at the meeting waypoints within 10 minutes of each other. Because this analytic considers physical (LAT/LONG) locations and travel velocities, it can provide co-traveler results that include selectors on different provider networks.

Source Data:
- Currently running on RT-RG [redacted]
- Could possibly scale to FASCIA event summaries

Architecture:
- Cloud-based

Status:
- QFD available at RT-RG analyst desktop.
- RT-RG Tools: Goldminer, CHET, GEOT

Caveats:
- Requires pre-computing all selectors against all selectors, which can be expensive
- Current output includes only tasked selectors
- Analytic is not designed for stationary targets

Name of Analytic: Scalable Analytics Tradecraft Center (SATC) Geospatial Lifelines Co-Travel QFD

Summary: This QFD first generates geohashes using GSM event data, and then calculates transition lines indicating that a device traveled from one geohash to another. The likelihood of co-travel is based on dwell times at travel endpoints, and the following measurements: net distance, time of transition (mins), speed (kph), azimuth, and number of travel segments.

Source Data:
- Geohashes of GSM event data retrieved from FASCIA.

Status:
- Analytic tested on 90 days of GSM event data from [redacted]
- Code is available through SATC, but analytic is no longer under development.

Caveats:
- Analytic designed for GSM data, but could be applied to other types of data
- Oriented to targets that remain in one location for at least 2 hours
- Requires Geocoded source data for generating Geohashes

Name of Analytic: SSG Common IMSIs Analytic

Summary: This SEDB JEMA model finds SIM card activity seen on cell tower panels in multiple areas. The analyst inputs areas of interest and time range. The analytic returns an Excel file with a list of IMSIs seen in those areas at that time, enriched with OCTAVE tasking information.

Source Data:
- OCTAVE and FASCIA data

Architecture:
- Tower QFD

Status:
- Operational, available in JEMA.

Caveats:
- Cell tower locations in OCTSKYWARD can be imprecise.
- The SEDB Tower QFD summarizes IMSIs by LAIC by day.
- Summaries by MSISDN or IMEI are not available.

Name of Analytic: Target Analysis Center (TAC)/Cafe/ Travel and Mobility Analysis Center (TMAC) DNI Co-Travel Analytic

Summary: Discovers candidate co-travelers based on how many times selectors were seen in the same countries and cities during the same months as tasked targets. Locations are given by DNI selector IP geolocation, provided by ASDF enriched with GEO reference data.

Source Data:
- Tasked DNI selectors (UTT)
- Geotagged ASDF data
- User-provided travel patterns

Architecture:
- Cloud-based GM-PLACE

Status:
- Available to developers with access to Ghostmachine (GM-PLACE)

Caveats:
- Tasked targets provided as input; results include tasked and untasked targets
- Analytic operates at the country level, and is designed to provide monthly QFD roll-up
- Proxies can make IP resolution challenging

Name of Analytic: TAC/Cafe/ TMAC DNR Co-Traveler Analytic

Summary: (TS//SI/REL TO USA, FVEY) The DNR cloud-based analytic considers all known targets (tasked in OCTAVE) that have traveled within a given month, and attempts to find their co-travelers. Co-travelers are defined as individuals that were seen in the same area (defined by Country, VLR, or Cell ID) around the same time as the targets. The output includes both tasked and untasked selectors as possible co-travelers with the tasked seeds.

Source Data:
- FASCIA data on Ghostmachine
- 40.7B rows in the CLOUDBASE table
- CHALKFUN Enrichment (VLR Country mapping)
- CLOUDBASE Events (IMSI, IMEI) rounded to nearest hour

Architecture:
- Cloud-based GM-PLACE

Status:
- Under development

Caveats:
- Analytic only considers tasked selectors as seeds.